3 research outputs found

    Visualization of Data by Method of Elastic Maps and Its Applications in Genomics, Economics and Sociology

    A technology for data visualization and data modeling is proposed. It is based on the original idea of the elastic net: a regular point approximation of a manifold that is embedded in the multidimensional data space and has, in a certain sense, minimal energy. This manifold is an analogue of a principal surface and serves as a non-linear screen onto which the multidimensional data are projected. A short review of relevant methods is given. The proposed methods are illustrated by applying them to real economic, sociological and biological datasets and to some model data distributions. A remarkable feature of the technology is its ability to work with gaps in data tables and to fill them. Gaps are unknown or unreliable values of some features. The technology makes it possible to predict plausible values of unknown features from the values of the others, and so provides a way to construct various forecasting systems and non-linear regressions. The technology can be used by specialists in different fields. Several examples of applying the method are presented at the end of the paper.
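The gap-filling idea in this abstract can be illustrated with a minimal sketch. It assumes the nodes of the map have already been constructed and simply snaps an incomplete record to the nearest node, measuring distance only over the known features. The function name `fill_gaps` and the list-of-lists representation are hypothetical, and the paper's own procedure (for example, projecting onto the manifold between nodes) may well differ.

```python
def fill_gaps(record, nodes):
    """Fill None entries of `record` with values taken from the nearest node.

    Assumption (not from the paper): the nearest node is chosen by squared
    Euclidean distance computed over the known coordinates only.
    """
    known = [i for i, value in enumerate(record) if value is not None]
    if not known:
        raise ValueError("record has no known features to project with")

    def masked_distance(node):
        # Distance restricted to the coordinates that are present in the record.
        return sum((record[i] - node[i]) ** 2 for i in known)

    nearest = min(nodes, key=masked_distance)
    # Keep known values as they are; take missing ones from the nearest node.
    return [nearest[i] if value is None else value
            for i, value in enumerate(record)]


# Hypothetical map nodes in a 3-feature space and one record with a gap.
nodes = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
filled = fill_gaps([0.9, None, 1.2], nodes)
```

Here the second feature of the record is unknown, and the sketch predicts it from the node that best matches the two known features.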

    Seven clusters in genomic triplet distributions

    Motivation: Several recent papers have proposed new algorithms for detecting coding regions without requiring a learning dataset of already known genes. In this paper we study the cluster structure of several genomes in the space of codon usage. This allows us to interpret some of the results obtained in other studies and to propose a simpler method that is nevertheless fully functional. Results: Several complete genomic sequences were analyzed by visualizing tables of triplet counts in a sliding window. The distribution of 64-dimensional vectors of triplet frequencies displays a well-detectable cluster structure. The structure was found to consist of seven clusters: six correspond to protein-coding information in one of the three possible phases on one of the two complementary strands, and one corresponds to the non-coding regions. Awareness of this structure allows the development of methods for segmenting sequences into regions with the same coding phase and non-coding regions. Such a method may be completely unsupervised or may use some external information. Since the method does not need extraction of ORFs, it can be applied even to unassembled genomes. Accuracy calculated at the base-pair level (both sensitivity and specificity) exceeds 90%. This is no worse than methods such as HMM, while the method has the advantage of being much simpler and clearer.
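The 64-dimensional triplet-frequency vectors described in this abstract can be sketched as follows. This is an illustration only: the function names, the choice of non-overlapping triplets read from the start of each window, and the default `width` and `step` values are assumptions, not details taken from the paper.

```python
from collections import Counter
from itertools import product

# All 64 triplets over the DNA alphabet, in a fixed order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(window):
    """Return the 64-dimensional frequency vector of a window.

    Assumption: triplets are non-overlapping and read from the first
    base of the window, so the vector depends on the reading phase.
    """
    counts = Counter(window[i:i + 3] for i in range(0, len(window) - 2, 3))
    total = sum(counts[t] for t in TRIPLETS)  # ignores triplets with N etc.
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]


def sliding_window_vectors(sequence, width=300, step=10):
    """Yield (start, frequency_vector) pairs along the sequence.

    `width` and `step` are illustrative values, not the paper's settings.
    """
    sequence = sequence.upper()
    return [
        (start, triplet_frequencies(sequence[start:start + width]))
        for start in range(0, len(sequence) - width + 1, step)
    ]


vectors = sliding_window_vectors("ATGATGATGATG", width=9, step=3)
```

The resulting list of vectors is what a clustering or visualization step would then operate on; that step is not shown here.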

    Constructive Methods of Invariant Manifolds for Kinetic Problems

    We present the Constructive Methods of Invariant Manifolds for model reduction in physical and chemical kinetics, developed over the last two decades. The physical problem of reduced description is studied in its most general form as the problem of constructing the slow invariant manifold. The invariance conditions are formulated as a differential equation for a manifold immersed in the phase space (the invariance equation). The equation of motion for immersed manifolds is obtained (the film extension of the dynamics). Invariant manifolds are fixed points of this equation, and slow invariant manifolds are Lyapunov-stable fixed points; thus slowness is presented as stability. A collection of methods for constructing slow invariant manifolds is presented; in particular, the Newton method subject to incomplete linearization is the analogue of KAM methods for dissipative systems. The systematic use of thermodynamic structures and of the quasi-chemical representation makes it possible to construct approximations that are consistent with physical restrictions. We systematically consider invariant grids, a discrete analogue of the slow (stable) positively invariant manifolds for dissipative systems. Dynamic and static postprocessing procedures make it possible to estimate the accuracy of the approximations obtained and to improve this accuracy significantly.
    The following examples of applications are presented: nonperturbative derivation of physically consistent hydrodynamics from the Boltzmann equation and from the reversible dynamics, for Knudsen numbers Kn ~ 1; construction of the moment equations for nonequilibrium media and their dynamical correction (instead of extending the list of variables) to gain more accuracy in the description of highly nonequilibrium flows; determination of molecular dimensions (as diameters of equivalent hard spheres) from experimental viscosity data; invariant grids for a two-dimensional catalytic reaction and a four-dimensional oxidation reaction (six species, two balances); a universal continuous-media description of dilute polymeric solutions; the limits of macroscopic description for polymer molecules, etc.
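The invariance equation and the film extension mentioned in this abstract can be written out in a commonly used form. The notation below is an assumption for illustration, since the abstract itself gives no formulas: $J$ is the vector field of the kinetic system, $F$ is the embedding of the manifold with parameter $y$, and $P_y$ is a projector onto the tangent space of the manifold at $F(y)$.

```latex
% Kinetic system in the phase space (J is the assumed vector field)
\frac{dx}{dt} = J(x)

% Invariance equation: the vector field at each point of the manifold
% has no component transversal to the tangent space
(1 - P_y)\, J(F(y)) = 0

% Film extension of the dynamics: the manifold moves with the
% transversal component of the vector field
\frac{\partial F_t(y)}{\partial t} = (1 - P_{t,y})\, J(F_t(y))
```

In this form, a manifold that satisfies the invariance equation is a fixed point of the film equation, which matches the statement in the abstract.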